Prediction of cis/trans isomerization using feature selection and support vector machines
نویسندگان
چکیده
In protein structures the peptide bond is found to be in trans conformation in the majority of the cases. Only a small fraction of peptide bonds in proteins is reported to be in cis conformation. Most of these instances (>90%) occur when the peptide bond is an imide (X-Pro) rather than an amide bond (X-nonPro). Due to the implication of cis/trans isomerization in many biologically significant processes, the accurate prediction of the peptide bond conformation is of high interest. In this study, we evaluate the effect of a wide range of features, towards the reliable prediction of both proline and non-proline cis/trans isomerization. We use evolutionary profiles, secondary structure information, real-valued solvent accessibility predictions for each amino acid and the physicochemical properties of the surrounding residues. We also explore the predictive impact of a modified feature vector, which consists of condensed position-specific scoring matrices (PSSMX), secondary structure and solvent accessibility. The best discriminating ability is achieved using the first feature vector combined with a wrapper feature selection algorithm and a support vector machine (SVM). The proposed method results in 70% accuracy, 75% sensitivity and 71% positive predictive value (PPV) in the prediction of the peptide bond conformation between any two amino acids. The output of the feature selection stage is investigated in order to identify discriminatory features as well as the contribution of each neighboring residue in the formation of the peptide bond, thus, advancing our knowledge towards cis/trans isomerization.
منابع مشابه
A Comparative Study of Extreme Learning Machines and Support Vector Machines in Prediction of Sediment Transport in Open Channels
The limiting velocity in open channels to prevent long-term sedimentation is predicted in this paper using a powerful soft computing technique known as Extreme Learning Machines (ELM). The ELM is a single Layer Feed-forward Neural Network (SLFNN) with a high level of training speed. The dimensionless parameter of limiting velocity which is known as the densimetric Froude number (Fr) is predicte...
متن کاملAnomaly Detection Using SVM as Classifier and Decision Tree for Optimizing Feature Vectors
Abstract- With the advancement and development of computer network technologies, the way for intruders has become smoother; therefore, to detect threats and attacks, the importance of intrusion detection systems (IDS) as one of the key elements of security is increasing. One of the challenges of intrusion detection systems is managing of the large amount of network traffic features. Removing un...
متن کاملSeparating Well Log Data to Train Support Vector Machines for Lithology Prediction in a Heterogeneous Carbonate Reservoir
The prediction of lithology is necessary in all areas of petroleum engineering. This means that to design a project in any branch of petroleum engineering, the lithology must be well known. Support vector machines (SVM’s) use an analytical approach to classification based on statistical learning theory, the principles of structural risk minimization, and empirical risk minimization. In this res...
متن کاملFeature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine
Different approaches have been proposed for feature selection to obtain suitable features subset among all features. These methods search feature space for feature subsets which satisfies some criteria or optimizes several objective functions. The objective functions are divided into two main groups: filter and wrapper methods. In filter methods, features subsets are selected due to some measu...
متن کاملStudy of Cis–trans Isomerization Mechanism of [3-(3-Aminomethyl) Phenylazo] Phenyl Acetic Acid as a Causative Role in Alzheimer Using Density Functional Theory
Amyloid-β (Aβ) self-assembly into cross-β amyloidfibrils is implicated in a causative role in Alzheimer’s disease pathology.Uncertainties persist regarding the mechanisms of amyloid self assembly and the role of metastable prefibrillar aggregates. Aβ fibrilsfeature a sheet-turn-sheet motif in the constituent β-strands; as such, turn nucleation has been proposed as a rate-limiting step in the se...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Journal of biomedical informatics
دوره 42 1 شماره
صفحات -
تاریخ انتشار 2009